N=1:

rank | frequency | n-gram |
---|---|---|
1 | 11627 | -n |
2 | 8648 | -a |
3 | 5245 | -” |
4 | 4791 | -i |
5 | 3039 | -h |

N=2:

rank | frequency | n-gram |
---|---|---|
1 | 9101 | -an |
2 | 3627 | -ya |
3 | 3250 | -,” |
4 | 2445 | -ng |
5 | 2106 | -ah |

N=3:

rank | frequency | n-gram |
---|---|---|
1 | 3277 | -nya |
2 | 3146 | -kan |
3 | 1127 | -ang |
4 | 788 | -n,” |
5 | 634 | -gan |

N=4:

rank | frequency | n-gram |
---|---|---|
1 | 1158 | -nnya |
2 | 699 | -an,” |
3 | 559 | -ngan |
4 | 531 | -akan |
5 | 435 | -ikan |

N=5:

rank | frequency | n-gram |
---|---|---|
1 | 1076 | -annya |
2 | 314 | -angan |
3 | 249 | -nya,” |
4 | 238 | -aysia |
5 | 235 | -ngkan |

The tables show the most frequent letter N-grams at the ends of words for N=1…5. The procedure parallels 2.2.5 Most frequent word beginnings, except that the aim here is suffix detection rather than prefix detection. Endings that contain punctuation, such as -” and -,”, are counted as well.
For N=3:
SELECT @pos:=(@pos+1), xx.* FROM (SELECT @pos:=0) r, (SELECT COUNT(*) AS cnt, CONCAT("-", RIGHT(word,3)) AS ngram FROM words WHERE w_id>100 GROUP BY RIGHT(word,3) ORDER BY cnt DESC) xx LIMIT 5;
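The same count can be sketched outside the database for any N. The Python below is only an illustration, not the code used to produce the tables: the function name `top_endings` and the toy word list are hypothetical, and it assumes one entry per row of the `words` table. The `w_id>100` filter from the query is not reproduced.

```python
from collections import Counter


def top_endings(words, n, k=5):
    """Return the k most frequent word-final letter n-grams
    as (rank, frequency, n-gram) rows."""
    # Like SQL RIGHT(word, n), a word shorter than n contributes itself whole.
    counts = Counter(word[-n:] for word in words if word)
    # Descending frequency; ties are broken alphabetically so the
    # result is deterministic (the SQL query leaves tie order open).
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))[:k]
    return [
        (rank, freq, "-" + ending)
        for rank, (ending, freq) in enumerate(ranked, start=1)
    ]


# Hypothetical toy word list, not taken from the corpus.
sample_words = ["makanan", "minuman", "bukunya", "rumahnya", "berikan", "jalankan"]

for row in top_endings(sample_words, n=3, k=2):
    print(row)
```

Calling `top_endings(words, n)` for `n` from 1 to 5 gives one table per N in the layout used above.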